Enriched by natural language texts, Stack Overflow code snippets are an invaluable code-centric knowledge base of small units of source code. Besides being useful for software developers, these annotated snippets can potentially serve as the basis for automated tools that provide working code solutions to specific natural language queries. With the goal of developing automated tools with the Stack Overflow snippets and surrounding text, this paper investigates the following questions: (1) How usable are the Stack Overflow code snippets? and (2) When using text search engines for matching on the natural language questions and answers around the snippets, what percentage of the top results contain usable code snippets? A total of 3M code snippets are analyzed across four languages: C#, Java, JavaScript, and Python. Python and JavaScript proved to be the languages for which the most code snippets are usable. Conversely, Java and C# proved to be the languages with the lowest usability rate. Further qualitative analysis on usable Python snippets shows the characteristics of the answers that solve the original question. Finally, we use Google search to investigate the alignment of usability and the natural language annotations around code snippets, and explore how to make snippets in Stack Overflow an adequate base for future automatic program generation.